AITopics | label hierarchy

Deep Neural Networks inherit spurious correlations embedded in training data and hence may fail to predict desired labels on unseen domains (or environments) whose distributions differ from that of the training domain. Invariance Learning (IL) has been developed recently to overcome this shortcoming; using training data from many domains, IL estimates a predictor that is invariant to a change of domain. However, requiring training data from multiple domains strongly restricts the use of IL, since it demands expensive annotation. We propose a novel IL framework to overcome this problem. Assuming the availability of data from multiple domains for a higher-level classification task, for which the labeling cost is lower, we estimate an invariant predictor for the target classification task with training data gathered in a single domain. Additionally, we propose two cross-validation (CV) methods for selecting hyperparameters of invariance regularization, a problem that has not been addressed properly in existing IL methods. The effectiveness of the proposed framework, including the cross-validation, is demonstrated empirically. Theoretical analysis reveals that our framework can estimate the desirable invariant predictor when the hyperparameter is set correctly, and that such a hyperparameter is chosen by the proposed CV methods under some conditions.

invariance learning, name change, training data, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification

Neural Information Processing Systems | Oct-9-2025, 07:15:17 GMT

We investigate the problem of reducing mistake severity for fine-grained classification.

machine learning, natural language, prediction, (14 more...)

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Learning a Concept Hierarchy from Multi-labeled Documents

Viet-An Nguyen, Jordan L. Ying, Philip Resnik, Jonathan Chang

Neural Information Processing Systems | Oct-2-2025, 22:27:11 GMT

In this paper, we capture the dependence among the labels using a learned tree-structured hierarchy.

hierarchy, node, topic model, (16 more...)

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Colorado > Boulder County > Boulder (0.14)
Asia > Middle East > Jordan (0.05)
(4 more...)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
(2 more...)

Invariance Learning based on Label Hierarchy

Neural Information Processing Systems | Aug-18-2025, 16:01:11 GMT

Deep Neural Networks inherit biased correlations embedded in training data and hence may fail to predict desired labels on unseen domains (or environments) whose distributions differ from that of the training domain. Invariance Learning (IL) has been developed recently to overcome this shortcoming; using training data from many domains, IL estimates a predictor that is invariant to a change of domain. However, requiring training data from multiple domains strongly restricts the use of IL, since it demands expensive annotation. We propose a novel IL framework to overcome this problem. Assuming the availability of data from multiple domains for a classification task at a higher level, for which the labeling cost is lower, we estimate an invariant predictor for the target classification task with training data gathered in a single domain. Additionally, we propose two cross-validation (CV) methods for selecting hyperparameters of invariance regularization, a problem that has not been addressed properly in existing IL methods. The effectiveness of the proposed framework, including the cross-validation, is demonstrated empirically. Theoretical analysis reveals that our framework can estimate the desirable invariant predictor when the hyperparameter is set correctly, and that such a hyperparameter is chosen by the proposed CV methods under some conditions.

artificial intelligence, deep learning, machine learning, (18 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

8cf04c64d1734e5f7e63418a2a4d49de-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Aug-16-2025, 21:41:12 GMT

benchmark, machine learning, natural language, (20 more...)

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
(3 more...)

Learning a Concept Hierarchy from Multi-labeled Documents

Viet-An Nguyen, Jordan L. Ying, Philip Resnik, Jonathan Chang

Neural Information Processing Systems | Feb-9-2025, 04:41:54 GMT

While topic models can discover patterns of word usage in large corpora, it is difficult to meld this unsupervised structure with noisy, human-provided labels, especially when the label space is large.

data mining, machine learning, natural language, (19 more...)

Country:

Asia > North Korea (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Colorado > Boulder County > Boulder (0.14)
(22 more...)

Industry:

Government > Military (1.00)
Law (0.94)
Government > Regional Government > North America Government > United States Government (0.93)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.50)
(2 more...)

A Transformer-based Autoregressive Decoder Architecture for Hierarchical Text Classification

Younes Yousef, Lukas Galke, Ansgar Scherp

arXiv.org Artificial Intelligence | Jan-23-2025

Recent approaches in hierarchical text classification (HTC) rely on the capabilities of a pre-trained transformer model and exploit the label semantics and a graph encoder for the label hierarchy. In this paper, we introduce an effective hierarchical text classifier RADAr (Transformer-based Autoregressive Decoder Architecture) that is based only on an off-the-shelf RoBERTa transformer to process the input and a custom autoregressive decoder with two decoder layers for generating the classification output. Thus, unlike existing approaches for HTC, the encoder of RADAr has no explicit encoding of the label hierarchy and the decoder relies solely on the label sequences of the samples observed during training. We demonstrate on three benchmark datasets that RADAr achieves results competitive with the state of the art with less training and inference time. Our model consistently performs better when organizing the label sequences from children to parents versus the inverse, as done in existing HTC approaches. Our experiments show that neither the label semantics nor an explicit graph encoder for the hierarchy is needed. This has strong practical implications for HTC as the architecture has fewer requirements and provides a speed-up by a factor of 2 at inference time. Moreover, training a separate decoder from scratch in conjunction with fine-tuning the encoder allows future researchers and practitioners to exchange the encoder part as new models arise. The source code is available at https://github.com/yousef-younes/RADAr.
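
The children-to-parents ordering mentioned in the abstract above can be illustrated with a small sketch. This is not taken from the RADAr code base: the toy hierarchy `PARENT` and the helpers `depth` and `linearize` are hypothetical names introduced here, and the paper's exact linearization (tie-breaking, special tokens, separators) may differ. The sketch only shows what it means to turn a sample's label set into a target sequence ordered from the deepest labels up to the top-level ones, or the inverse.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical toy hierarchy: child label -> parent label (None = top-level label).
PARENT: Dict[str, Optional[str]] = {
    "science": None,
    "physics": "science",
    "optics": "physics",
    "sports": None,
    "tennis": "sports",
}


def depth(label: str, parent: Dict[str, Optional[str]]) -> int:
    """Number of edges between a label and its top-level ancestor."""
    d = 0
    while parent[label] is not None:
        label = parent[label]
        d += 1
    return d


def linearize(
    labels: Iterable[str],
    parent: Dict[str, Optional[str]],
    children_first: bool = True,
) -> List[str]:
    """Order a sample's labels by hierarchy depth.

    children_first=True puts the deepest labels first (children to parents);
    children_first=False gives the inverse, parents-to-children order.
    Ties within a level are broken alphabetically (an assumption of this sketch).
    """
    sign = -1 if children_first else 1
    return sorted(labels, key=lambda lab: (sign * depth(lab, parent), lab))


if __name__ == "__main__":
    sample = {"science", "physics", "optics"}
    print(linearize(sample, PARENT))                        # deepest label first
    print(linearize(sample, PARENT, children_first=False))  # top-level label first
```

A sequence built this way would serve as the decoder's training target; which ordering works better is an empirical question that the abstract reports in favor of children to parents.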

large language model, machine learning, natural language, (21 more...)

doi: 10.3233/FAIA240661

arXiv: 2501.13598

Country: